Abstract. In this paper, we provide upper bounds on several Rubinstein-type
distances on the configuration space equipped with the Poisson measure.
Our inequalities involve the two well-known gradients, in the sense of
Malliavin calculus, which can be defined on this space. Actually, we show that
depending on the distance between configurations which is considered, it is
one gradient or the other which is the most effective. Some applications to
distance estimates between Poisson and other more sophisticated processes
are also provided, and an application of our results to tail and isoperimetric
estimates completes this work.



1. Introduction 

Let $\Lambda$ be a $\sigma$-compact metric space and $\Gamma_\Lambda$ be the space of
configurations on $\Lambda$ equipped with a Poisson measure $\mu$. Defining and evaluating
distances between probability measures on $\Gamma_\Lambda$ is an important problem, both
theoretically and for applications, since it is equivalent to defining distances between
point processes (see for instance Chapters 2 and 3 of [17] for a thorough discussion and
references about this topic). Among the large class of distances one may consider,
the one we want to study relies on an optimal transportation problem. Letting $\rho$
be a lower semi-continuous distance on $\Gamma_\Lambda$ and $\omega, \eta \in \Gamma_\Lambda$
be two configurations, we understand the quantity $\rho(\omega, \eta)$ as the cost for
transporting one unit of mass from $\omega$ to $\eta$. Hence the optimal transportation
cost between $\mu$ and some probability measure $\nu$ on $\Gamma_\Lambda$ is given by

$$\mathcal{T}_\rho(\mu, \nu) = \inf_{\gamma \in \Sigma(\mu, \nu)}
  \int_{\Gamma_\Lambda} \int_{\Gamma_\Lambda} \rho(\omega, \eta) \, d\gamma(\omega, \eta),$$

where $\Sigma(\mu, \nu)$ is the set of probability measures on
$\Gamma_\Lambda \times \Gamma_\Lambda$ with marginals $\mu$ and $\nu$. Such a quantity is
called the Rubinstein distance between $\mu$ and $\nu$. Being defined by a variational
formula, its explicit expression is difficult to obtain in general, but it can be
estimated from above: the construction of any coupling between $\mu$ and $\nu$ yields a
bound on the Rubinstein distance between $\mu$ and $\nu$. In particular, a
convenient upper bound ensures its finiteness, which is not guaranteed a priori.
Another interesting property of $\mathcal{T}_\rho$ is its rich duality. More precisely,
the Kantorovich-Rubinstein duality allows us to rewrite the Rubinstein distance as

$$\mathcal{T}_\rho(\mu, \nu) = \sup_{F \in \rho\text{-Lip}_1} \int_{\Gamma_\Lambda} F \, d(\mu - \nu),$$

where $\rho\text{-Lip}_1$ denotes the set of 1-Lipschitz functions on $\Gamma_\Lambda$ with
respect to the distance $\rho$. This means that $\mathcal{T}_\rho$ depends crucially on the
distance on the configuration space, as it changes the set of Lipschitz functions, and hence
incorporates a lot of information on the geometry of $\Gamma_\Lambda$. Using the dual
definition of the Rubinstein distance instead of the original one can be very relevant in
some cases.
Given a probability measure $\nu$ with density $L$ with respect to the Poisson reference
measure $\mu$, our purpose in the present paper is to control from above the
Rubinstein distance $\mathcal{T}_\rho(\mu, \nu)$ in terms of convenient (and easily
computable) quantities involving the density $L$. Such inequalities belong to the domain of
functional inequalities, which is by now a wide field of research with numerous methods of
proof. See for instance the very complete monograph [18], and particularly Chapters 21
and 22, for a large panorama on this topic, with precise references and credit.

To derive our inequalities, the two main ingredients at work are other representations
of the Rubinstein distance and the Rademacher property. On the one hand,
such representations can be obtained either by embedding the two probability
measures into the evolution of a Markov semi-group, or by using the so-called
Clark formula. On the other hand, the Rademacher property formally states that
given a distance $\rho$, there exists a notion of gradient such that its domain contains
the set $\rho\text{-Lip}_1$ and any function in $\rho\text{-Lip}_1$ has a gradient whose
norm is less than 1, i.e., that we can proceed as in finite dimension.

For these two steps, we need a notion of gradient. In the setting of configuration
spaces, such a notion does exist within the Malliavin calculus. In fact, we even
have two notions of gradient: a "differential" gradient (see [1, 15]) and a gradient
expressed as a finite difference operator (see [13]). We show that depending on
the distance $\rho$ chosen on the configuration space, one gradient or the other is more
convenient, i.e., the Rademacher property holds with one notion of gradient, or
the other.

The paper is organized as follows. After the preliminaries of Section 2, we provide
in Section 3 various upper bounds on the Rubinstein distance $\mathcal{T}_\rho(\mu, \nu)$,
where $\rho$ is the total variation distance, the Wasserstein distance or the trivial
distance on the configuration space $\Gamma_\Lambda$. Based on a semi-group approach, the
first abstract upper bound involves the gradient associated to our given distance $\rho$ in
the sense of the Rademacher property. When dealing with the total variation distance on
the one hand, such an estimate has a simplified expression, contained in our first
main result, Theorem 3.2, which can be retrieved by using an alternative method,
namely the Clark formula. On the other hand, when the configuration space is
equipped with the Wasserstein distance, the upper bound we give in our second
main result, Theorem 3.4, relies on a time-change argument together with the
Girsanov Theorem. Finally, Section 4 is devoted to numerous applications of
these two inequalities: by choosing the probability measure $\nu$ as the distribution
of a given process, we are able to estimate from above distances between Poisson
processes, between Poisson and Cox processes, between Poisson and Gibbs
processes, etc. We thus hope to give a systematic treatment of the various situations
one may encounter in applications. We conclude this work by providing another
consequence of Theorem 3.2 to tail and isoperimetric estimates. In particular, we
obtain sharp deviation inequalities for the total variation distance and also a new
estimate of the classical isoperimetric constant, which is asymptotically sharp when
the total mass of $\Lambda$ is small.

2. Preliminaries 

Let $X$ be a Polish space and $\rho$ a lower semi-continuous distance on $X \times X$,
which does not necessarily generate the topology on $X$. Given two probability measures
$\mu$ and $\nu$ on $X$, the optimal transportation problem associated to $\rho$ consists in
evaluating the distance

$$\mathcal{T}_\rho(\mu, \nu) = \inf_{\gamma \in \Sigma(\mu, \nu)}
  \int_X \int_X \rho(x, y) \, d\gamma(x, y), \tag{2.1}$$

where $\Sigma(\mu, \nu)$ is the set of probability measures on $X \times X$ with first
(respectively second) marginal $\mu$ (respectively $\nu$). By Theorem 4.1 in [18], there
exists at least one probability measure $\gamma$ for which the infimum is attained.
According to the celebrated Kantorovich-Rubinstein duality theorem, cf. Theorem 5.10 in
[18], this minimum is equal to

$$\mathcal{T}_\rho(\mu, \nu) = \sup_{F \in \rho\text{-Lip}_1} \int_X F \, d(\mu - \nu), \tag{2.2}$$

where $\rho\text{-Lip}_m$ is the set of bounded Lipschitz continuous functions $F$ from $X$
to $\mathbb{R}$ with Lipschitz constant $m$:

$$|F(x) - F(y)| \le m \, \rho(x, y), \quad x, y \in X.$$

In the context of optimal transportation, $\mathcal{T}_\rho$ is considered as a Rubinstein
distance since the cost function is already a distance (see for instance the bibliographical
notes at the end of Chapter 6 in [18]).
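
The equality between the primal problem (2.1) and the dual problem (2.2) can be checked by hand in a toy case. The sketch below is only an illustration under assumptions that are not in the paper: $X$ is a finite set, $\rho$ is the trivial distance (cost 1 whenever $x \neq y$), and the measures `mu` and `nu` are made-up examples. The primal side uses the classical maximal coupling, and the dual side uses the indicator of the set where `mu` exceeds `nu`, which is 1-Lipschitz for this distance.

```python
from fractions import Fraction


def primal_trivial_cost(mu, nu):
    """Cost of the maximal coupling for the cost 1_{x != y}.

    The coupling keeps mass min(mu(x), nu(x)) on the diagonal, so the
    transported mass is 1 - sum_x min(mu(x), nu(x)).
    """
    keys = set(mu) | set(nu)
    return 1 - sum(min(mu.get(k, 0), nu.get(k, 0)) for k in keys)


def dual_trivial_value(mu, nu):
    """Value of int F d(mu - nu) for F = indicator of {mu > nu}."""
    keys = set(mu) | set(nu)
    return sum(max(mu.get(k, 0) - nu.get(k, 0), 0) for k in keys)


# Hypothetical probability measures on the finite set {a, b, c}.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
nu = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}

primal = primal_trivial_cost(mu, nu)
dual = dual_trivial_value(mu, nu)
print(primal, dual)
```

Any coupling gives an upper bound on the infimum and any admissible $F$ gives a lower bound on the supremum, so the fact that the two numbers agree shows that both choices are optimal in this example.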

In this paper, we consider the situation where $X = \Gamma_\Lambda$ is the configuration
space on a $\sigma$-compact metric space $\Lambda$ with Borel $\sigma$-algebra
$\mathcal{B}(\Lambda)$, i.e.,

$$\Gamma_\Lambda = \{\omega \subset \Lambda : \omega \cap K \text{ is a finite set for every compact } K \in \mathcal{B}(\Lambda)\}.$$

Here the $\sigma$-compactness means that $\Lambda$ can be partitioned into the union of
countably many compact subspaces. We identify $\omega \in \Gamma_\Lambda$ and the positive
Radon measure $\sum_{x \in \omega} \varepsilon_x$, where $\varepsilon_a$ is the Dirac
measure at point $a$. Throughout this paper, $\Gamma_\Lambda$ is endowed with the vague
topology, i.e., the weakest topology such that for all $f \in \mathcal{C}_0(\Lambda)$
(continuous with compact support on $\Lambda$), the following maps

$$\omega \mapsto \int_\Lambda f \, d\omega = \sum_{x \in \omega} f(x)$$

are continuous. When $f$ is the indicator function of a subset $B$, we will use the
shorter notation $\omega(B)$ for the integral of $\mathbf{1}_B$ with respect to $\omega$.
We denote by $\mathcal{B}(\Gamma_\Lambda)$ the corresponding Borel $\sigma$-algebra. Let
$\mathfrak{M}(\Lambda)$ be the space of positive and diffuse Radon measures on
$\mathcal{B}(\Lambda)$, equipped with the topology of vague convergence and endowed with
the corresponding Borel $\sigma$-field. Given a measure $\sigma \in \mathfrak{M}(\Lambda)$,
the probability space under consideration in the remainder of this paper will be the
Poisson space $(\Gamma_\Lambda, \mathcal{B}(\Gamma_\Lambda), \mu_\sigma)$, where
$\mu_\sigma$ is the Poisson measure of intensity $\sigma$, i.e., the probability measure on
$\Gamma_\Lambda$ fully characterized by

$$\mathbb{E}_{\mu_\sigma}\Big[\exp\Big(-\int_\Lambda f \, d\omega\Big)\Big]
  = \exp\Big(\int_\Lambda (e^{-f} - 1) \, d\sigma\Big),$$

for all $f \in \mathcal{C}_0(\Lambda)$. Here $\mathbb{E}_{\mu_\sigma}$ stands for the
expectation under the measure $\mu_\sigma$.



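2.1. Distances on the configuration space $\Gamma_\Lambda$. Several distance
concepts are available between elements of the configuration space $\Gamma_\Lambda$, cf. for
instance [17] for a thorough discussion about this topic. We introduce only three
of them which will be useful in the sequel. Let $\omega$ and $\eta$ be two configurations in
$\Gamma_\Lambda$.

Trivial distance: The trivial distance is simply given by

$$\rho_0(\omega, \eta) = \mathbf{1}_{\{\omega \neq \eta\}}.$$

Total variation distance: The total variation distance is defined as

$$\rho_1(\omega, \eta) = \sum_{x \in \Lambda} |\omega(\{x\}) - \eta(\{x\})|
  = \omega\Delta\eta(\Lambda) + \eta\Delta\omega(\Lambda),$$

where $\omega\Delta\eta = \omega \setminus (\omega \cap \eta)$.

Wasserstein distance: If $\Lambda = \mathbb{R}^k$ and $\kappa$ is the Euclidean distance,
the Wasserstein distance is given by

$$\rho_2(\omega, \eta) = \inf_{\beta \in \Sigma(\omega, \eta)}
  \sqrt{\int_\Lambda \int_\Lambda \kappa(x, y)^2 \, d\beta(x, y)},$$

where $\Sigma(\omega, \eta)$ denotes the set of configurations
$\beta \in \Gamma_{\Lambda \times \Lambda}$ having marginals $\omega$ and $\eta$, see [6, 15].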

Let us comment on these notions of distance on the configuration space $\Gamma_\Lambda$.
First, the total variation distance $\rho_1$ is nothing but the number of different atoms
between two configurations. In particular, we allow them to be infinite so that the
total variation distance might take infinite values. Note that our definition is a
straightforward generalization of the classical notion of total variation distance
between probability measures, since it coincides with the usual definition when
the configurations are normalized by their total masses.

Like the total variation distance $\rho_1$, the Wasserstein distance $\rho_2$ might take
infinite values. Indeed, if the total masses of two configurations $\omega$ and $\eta$ are
finite but differ, then there exists no coupling configuration $\beta$ in
$\Sigma(\omega, \eta)$, hence the distance should be infinite. If
$\omega(\Lambda) = \eta(\Lambda) < +\infty$ with
$\omega = \sum_{j=1}^{\omega(\Lambda)} \varepsilon_{x_j}$ and
$\eta = \sum_{j=1}^{\eta(\Lambda)} \varepsilon_{y_j}$, we can then write

$$\rho_2(\omega, \eta)^2 = \inf_{\tau \in \mathfrak{S}_{\omega(\Lambda)}}
  \sum_{j=1}^{\omega(\Lambda)} \kappa(x_j, y_{\tau(j)})^2,$$

where $\mathfrak{S}_{\omega(\Lambda)}$ denotes the symmetric group on the finite set
$\{1, 2, \ldots, \omega(\Lambda)\}$. As such $\rho_2$ appears as the dimension-free
generalization of the Euclidean distance.
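
For finite configurations, the three distances above can be evaluated directly. The following sketch is an illustration only: it assumes configurations are given as Python lists of points of the plane (tuples of floats), and it computes the Wasserstein distance by brute force over all permutations, which follows the permutation formula literally but is only practical for a handful of atoms. The function names are hypothetical.

```python
from collections import Counter
from itertools import permutations
from math import dist, inf, sqrt


def rho0(omega, eta):
    """Trivial distance: 0 if the configurations coincide, 1 otherwise."""
    return 0 if Counter(omega) == Counter(eta) else 1


def rho1(omega, eta):
    """Total variation distance: number of atoms that are not shared."""
    a, b = Counter(omega), Counter(eta)
    return sum(abs(a[x] - b[x]) for x in set(a) | set(b))


def rho2(omega, eta):
    """Wasserstein distance via the permutation formula (brute force)."""
    if len(omega) != len(eta):
        return inf  # no coupling configuration exists when the masses differ
    best = min(
        sum(dist(x, y) ** 2 for x, y in zip(omega, perm))
        for perm in permutations(eta)
    )
    return sqrt(best)


# Two hypothetical configurations with two atoms each in the plane.
omega = [(0.0, 0.0), (1.0, 0.0)]
eta = [(0.0, 1.0), (1.0, 0.0)]
print(rho0(omega, eta), rho1(omega, eta), rho2(omega, eta))
```

Here the two configurations share the atom $(1, 0)$, so the total variation distance counts the two remaining atoms, while the Wasserstein distance is the length of the displacement from $(0, 0)$ to $(0, 1)$.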
In order to use the Kantorovich-Rubinstein duality theorem, the lower
semi-continuity of the distances $\rho_i$, $i \in \{0, 1, 2\}$, is required. This is the
object of the next lemma.

Lemma 2.1. For any $i \in \{0, 1, 2\}$, the distance $\rho_i$ is lower semi-continuous on
the product space $\Gamma_\Lambda \times \Gamma_\Lambda$ equipped with the product topology.

Proof. It is immediate for the trivial distance $\rho_0$ and it is proved in Lemma 4.1 in
[15] for the Wasserstein distance $\rho_2$. To verify this property for the total variation
distance $\rho_1$, let $\alpha$ be a real number and consider $J_\alpha$ defined by

$$J_\alpha = \{(\omega, \eta) \in \Gamma_\Lambda \times \Gamma_\Lambda : \rho_1(\omega, \eta) \le \alpha\}.$$

Let $((\omega_n, \eta_n), n \ge 1)$ converge vaguely to $(\omega, \eta)$ and be such that
for any $n$, $(\omega_n, \eta_n)$ belongs to $J_\alpha$. By the triangle inequality, we
have for any compact set $K$ and any $n$:

$$\rho_1(\pi_K \omega, \pi_K \eta) \le \rho_1(\pi_K \omega, \pi_K \omega_n) + \alpha
  + \rho_1(\pi_K \eta_n, \pi_K \eta),$$

where $\pi_K$ denotes the restriction to $K$ of a configuration. Hence using the vague
convergence, we obtain that $(\pi_K \omega, \pi_K \eta) \in J_\alpha$. Finally, since the
metric space $\Lambda$ is $\sigma$-compact, the monotone convergence theorem for an
exhaustive sequence of compacts $(K_p)_{p \ge 1}$ entails that

$$\rho_1(\omega, \eta) = \lim_{p \to +\infty} \rho_1(\pi_{K_p} \omega, \pi_{K_p} \eta) \le \alpha,$$

hence the set $J_\alpha$ is vaguely closed. □

Let us mention that Lemma 2.1 entails the lower semi-continuity of the Rubinstein
distance $\mathcal{T}_{\rho_i}$, $i \in \{0, 1, 2\}$, with respect to the weak topology on
the space of probability measures on $\Gamma_\Lambda$, cf. for instance Remark 6.12 in [18].
In particular, since the space $\mathfrak{M}(\Lambda)$ is equipped with the vague topology,
the application $\sigma \mapsto \mu_\sigma$ is continuous, so that the mapping
$\sigma \mapsto \mathcal{T}_{\rho_i}(\mu_\sigma, \nu)$, $i \in \{0, 1, 2\}$, is lower
semi-continuous for any given probability measure $\nu$ on $\Gamma_\Lambda$. However for
$i \in \{1, 2\}$, the Rubinstein distance $\mathcal{T}_{\rho_i}$ is not continuous and
might be infinite since the distance $\rho_i$ is very often infinite itself, as in the
Wiener space situation of [9].
We mention that our definitions do not coincide with some of the usual
definitions of (bounded) distances between point processes, see for instance [2,
3, 17]. As mentioned above, it is customary to use the classical notion of total
variation by considering normalized configurations, i.e.,

$$\tilde\rho_1(\omega, \eta) = \sum_{x \in \Lambda}
  \left| \frac{\omega(\{x\})}{\omega(\Lambda)} - \frac{\eta(\{x\})}{\eta(\Lambda)} \right|,$$

provided both configurations have finite total masses. It should be noted that since
$\tilde\rho_1$ is not lower semi-continuous, the Kantorovich-Rubinstein duality theorem is
no longer satisfied, so that we cannot use the identity (2.2) in our framework. For
instance, let $\Lambda = \mathbb{R}$, $\omega = \varepsilon_0$ and $\eta = \varepsilon_1$.
Choose $\omega_n = \varepsilon_0 + \varepsilon_n$ and $\eta_n = \varepsilon_1 + \varepsilon_n$.
As $n$ goes to infinity, $\omega_n$ and $\eta_n$ tend vaguely to $\omega$ and $\eta$
respectively. However, we have $\tilde\rho_1(\omega, \eta) = 2$ whereas
$\tilde\rho_1(\omega_n, \eta_n) = 1$, for any integer $n \ge 2$.
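
The counterexample can be reproduced numerically. The sketch below assumes that the normalized total variation distance is the sum over atoms of the absolute difference of the normalized masses (each configuration divided by its total mass), which is the reading consistent with the values 2 and 1 quoted above; the function name `normalized_tv` is hypothetical.

```python
from collections import Counter
from fractions import Fraction


def normalized_tv(omega, eta):
    """Total variation distance between the normalized configurations."""
    a, b = Counter(omega), Counter(eta)
    mass_a, mass_b = sum(a.values()), sum(b.values())
    return sum(
        abs(Fraction(a[x], mass_a) - Fraction(b[x], mass_b))
        for x in set(a) | set(b)
    )


# Vague limit: omega = eps_0 and eta = eps_1.
limit_value = normalized_tv([0], [1])

# Approximating sequence: omega_n = eps_0 + eps_n, eta_n = eps_1 + eps_n.
along_sequence = [normalized_tv([0, n], [1, n]) for n in range(2, 6)]

print(limit_value, along_sequence)
```

The extra atom at $n$ escapes to infinity, so it disappears in the vague limit, but before the limit it halves every normalized mass. The value at the limit is therefore strictly larger than the values along the sequence, which is exactly what lower semi-continuity forbids.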
It is also customary to replace $\rho_2$ by a normalized distance $\tilde\rho_2$ defined by



The normalization by the inverse of $\omega(\Lambda)$ shrinks the $\rho_2$ distance by a
factor roughly equal to the expectation of $\omega(\Lambda)^{-1}$, see [6]. More
importantly, the term $|\omega(\Lambda) - \eta(\Lambda)|$ has no dimension (in the sense of
dimensional analysis) whereas the term involving $\rho_2$ has the dimension of a length.
Furthermore, the distance $\rho_2$ reflects interesting geometric properties of the space
$\Gamma_\Lambda$, like the Rademacher property (see Lemma 2.5 below), which are not shared
by $\tilde\rho_2$.

2.2. Malliavin derivatives and the Rademacher property. Before introducing
the so-called Rademacher property on the configuration space $\Gamma_\Lambda$, we need
some additional structure.

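Hypothesis 2.2. Assume now that we have:

• A kernel $Q$ on $\Gamma_\Lambda \times \Lambda$, i.e. $Q(\cdot, A)$ is measurable as a
function on $\Gamma_\Lambda$ for any $A \in \mathcal{B}(\Lambda)$ and $Q(\omega, \cdot)$ is
a positive Radon measure on $\mathcal{B}(\Lambda)$ for any $\omega \in \Gamma_\Lambda$. We
set $d\alpha(\omega, x) = Q(\omega, dx) \, d\mu_\sigma(\omega)$.

• A gradient/Malliavin derivative $\nabla$, defined on a dense subset
$\mathrm{Dom}\,\nabla$ of $L^2(\mu_\sigma)$, such that for any $F \in \mathrm{Dom}\,\nabla$,

$$\int_{\Gamma_\Lambda} \int_\Lambda |\nabla_x F(\omega)|^2 \, d\alpha(\omega, x) < +\infty,$$

i.e., the domain of the gradient is
$\mathrm{Dom}\,\nabla = \{F \in L^2(\mu_\sigma) : \nabla F \in L^2(\alpha)\}$.

We say that a process $u = u(\omega, x)$ belongs to $\mathrm{Dom}\,\delta$ whenever there
exists a constant $c$ such that for any $F \in \mathrm{Dom}\,\nabla$,

$$\left| \int_{\Gamma_\Lambda} \int_\Lambda \nabla_x F(\omega) \, u(\omega, x) \, d\alpha(\omega, x) \right|
  \le c \, \|F\|_{L^2(\mu_\sigma)}.$$

For such a process, we define the operator $\delta$ by duality:

$$\int_{\Gamma_\Lambda} \int_\Lambda \nabla_x F(\omega) \, u(\omega, x) \, d\alpha(\omega, x)
  = \int_{\Gamma_\Lambda} F \, \delta u \, d\mu_\sigma. \tag{2.3}$$

Denote the self-adjoint operator $\mathcal{L} = \delta\nabla$ acting on its domain
$\mathrm{Dom}\,\mathcal{L} \subset \mathrm{Dom}\,\nabla$ and let $(P_t)_{t \ge 0}$ be the
associated Ornstein-Uhlenbeck semi-group, i.e. the semi-group whose infinitesimal generator
is $-\mathcal{L}$.

Once the stochastic gradient has been introduced, let us relate it to the geometry
of the configuration space $\Gamma_\Lambda$.

Definition 2.3. Given a distance $\rho$ and a gradient $\nabla$ on $\Gamma_\Lambda$, we say
that the couple $(\nabla, \rho)$ has the Rademacher property whenever

$$\rho\text{-Lip}_1 \subset \mathrm{Dom}\,\nabla \quad \text{and} \quad
  |\nabla_x F(\omega)| \le 1, \ \alpha\text{-a.e., for any } F \in \rho\text{-Lip}_1. \tag{2.4}$$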
To investigate the Rubinstein distance associated to a distance on $\Gamma_\Lambda$, it will
be of crucial importance to find the convenient notion of gradient for which the
Rademacher property holds.

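Discrete gradient on configuration space. Given a functional $F \in L^2(\mu_\sigma)$,
the discrete gradient of $F$, denoted by $\nabla^\sharp F$, is defined by

$$\nabla^\sharp_x F(\omega) = F(\omega + \varepsilon_x) - F(\omega), \quad
  (\omega, x) \in \Gamma_\Lambda \times \Lambda.$$

In particular, $\mathrm{Dom}\,\nabla^\sharp$ is the subspace of $L^2(\mu_\sigma)$ random
variables such that

$$\mathbb{E}_{\mu_\sigma}\Big[\int_\Lambda |\nabla^\sharp_x F|^2 \, d\sigma(x)\Big] < +\infty.$$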

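We set $Q^\sharp(\omega, dx) = d\sigma(x)$ so that $\alpha^\sharp = \mu_\sigma \otimes \sigma$.
The $n$-th multiple stochastic integral of a real-valued square-integrable symmetric
function $f_n \in L^2(\sigma^{\otimes n})$ is defined as

$$J_n(f_n) = \int_{\Delta_n} f_n(x_1, \ldots, x_n) \, d(\omega - \sigma)(x_1) \cdots d(\omega - \sigma)(x_n),$$

where $\Delta_n = \{(x_1, \ldots, x_n) \in \Lambda^n, \ x_i \neq x_j, \ i \neq j\}$. As a
convention, we identify $L^2(\sigma^{\otimes 0})$ with $\mathbb{R}$ and let
$J_0(f_0) = f_0$, $f_0 \in L^2(\sigma^{\otimes 0}) \simeq \mathbb{R}$. We have the isometry
formula

$$\mathbb{E}_{\mu_\sigma}[J_n(f_n) J_m(f_m)] = n! \, \mathbf{1}_{\{n = m\}}
  \int_{\Lambda^n} f_n f_m \, d\sigma^{\otimes n}. \tag{2.5}$$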

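According to [16, 13], the Chaotic Representation Property holds on the configuration
space, i.e., every functional $F \in L^2(\mu_\sigma)$ can be written as

$$F = \mathbb{E}_{\mu_\sigma}[F] + \sum_{n=1}^{+\infty} J_n(f_n).$$

Moreover, if $F \in \mathrm{Dom}\,\nabla^\sharp$, then the discrete gradient acts on
multiple stochastic integrals as

$$\nabla^\sharp_x F = \sum_{n=1}^{+\infty} n \, J_{n-1}(f_n(\cdot, x)), \quad \alpha^\sharp\text{-a.e.}$$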



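Denote by $\delta^\sharp$ the adjoint operator of $\nabla^\sharp$ in the sense of (2.3).
Then the self-adjoint number operator
$\mathcal{L}^\sharp = \delta^\sharp \nabla^\sharp$ has the following expression in terms of
chaos:

$$\mathcal{L}^\sharp F = \sum_{n=1}^{+\infty} n \, J_n(f_n)$$

whenever $F \in \mathrm{Dom}\,\mathcal{L}^\sharp$, and the associated Ornstein-Uhlenbeck
semi-group $(P^\sharp_t)_{t \ge 0}$ is given by

$$P^\sharp_t F = \mathbb{E}_{\mu_\sigma}[F] + \sum_{n=1}^{+\infty} e^{-nt} J_n(f_n).$$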

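Hence the invariance property of the Poisson measure $\mu_\sigma$ with respect to the
semi-group reads as $\mathbb{E}_{\mu_\sigma}[P^\sharp_t F] = \mathbb{E}_{\mu_\sigma}[F]$.
Moreover, we have the commutation property between gradient and semi-group, which will be
useful in the sequel: if $F \in \mathrm{Dom}\,\nabla^\sharp$,

$$\nabla^\sharp_x P^\sharp_t F = e^{-t} P^\sharp_t \nabla^\sharp_x F, \quad x \in \Lambda, \ t \ge 0. \tag{2.6}$$

By the isometry formula (2.5), the semi-group is exponentially ergodic in $L^2(\mu_\sigma)$
with respect to the Poisson measure $\mu_\sigma$, i.e., for any $t \ge 0$,

$$\|P^\sharp_t F - \mathbb{E}_{\mu_\sigma}[F]\|^2_{L^2(\mu_\sigma)}
  = \sum_{n \ge 1} e^{-2nt} \, \mathbb{E}_{\mu_\sigma}\big[J_n(f_n)^2\big]
  \le e^{-2t} \, \|F - \mathbb{E}_{\mu_\sigma}[F]\|^2_{L^2(\mu_\sigma)}.$$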

Using the discrete gradient, the distances of interest on $\Gamma_\Lambda$ are the trivial
distance $\rho_0$ and the total variation distance $\rho_1$, as illustrated by the
following lemma.

Lemma 2.4. Assume that the intensity measure $\sigma$ is finite on $\Lambda$. Then the
couples $(\nabla^\sharp, \rho_0)$ and $(\nabla^\sharp, \rho_1)$ satisfy the Rademacher
property (2.4).

Proof. Letting $F \in \rho_i\text{-Lip}_1$, $i \in \{0, 1\}$, we have by the very
definition of the discrete gradient:

$$|\nabla^\sharp_x F(\omega)| = |F(\omega + \varepsilon_x) - F(\omega)|
  \le \rho_i(\omega + \varepsilon_x, \omega) \le 1.$$

Since $\sigma$ is finite, it follows that

$$\int_\Lambda |\nabla^\sharp_x F(\omega)|^2 \, d\sigma(x) \le \sigma(\Lambda),$$

hence that $F$ belongs to $\mathrm{Dom}\,\nabla^\sharp$. The proof is achieved. □
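
The pointwise bound in the proof can be observed on a concrete functional. The sketch below is an illustration under assumptions that are not in the paper: configurations are finite `frozenset`s of points of $[0, 1]$, and the test functional is the number of atoms in $[0, 1/2]$ capped at 3, which is 1-Lipschitz with respect to the total variation distance. All names are hypothetical.

```python
def discrete_gradient(F, omega, x):
    """Add-one-point difference F(omega + eps_x) - F(omega), for x not in omega."""
    return F(omega | {x}) - F(omega)


def capped_count(omega, a=0.0, b=0.5, cap=3):
    """Number of atoms of omega in [a, b], capped at `cap`.

    Adding or removing one atom changes the value by at most 1, so this
    functional is 1-Lipschitz for the total variation distance.
    """
    return min(sum(1 for y in omega if a <= y <= b), cap)


omega = frozenset({0.1, 0.3, 0.9})
grads = {x: discrete_gradient(capped_count, omega, x) for x in (0.2, 0.4, 0.7)}
print(grads)
```

Adding a point inside $[0, 1/2]$ raises the count by one, adding a point outside leaves it unchanged, and in every case the gradient stays in $[-1, 1]$ as the lemma requires.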

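Note that the converse direction holds for the total variation distance $\rho_1$. Indeed,
consider two configurations $\omega$ and $\eta$. If $\rho_1(\omega, \eta) = +\infty$, there
is nothing to prove. If $\rho_1(\omega, \eta)$ is finite, then since
$|\nabla^\sharp_x F(\omega)| \le 1$, $\alpha^\sharp$-a.e., we get

$$\begin{aligned}
|F(\eta) - F(\omega)| &\le |F(\eta \cap \omega \cup \eta\Delta\omega) - F(\eta \cap \omega)|
  + |F(\eta \cap \omega \cup \omega\Delta\eta) - F(\eta \cap \omega)| \\
&\le (\eta\Delta\omega)(\Lambda) + (\omega\Delta\eta)(\Lambda) \\
&= \rho_1(\eta, \omega).
\end{aligned}$$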

Differential gradient on configuration space. Let us introduce another
stochastic gradient on the configuration space $\Gamma_\Lambda$ which is a derivation, see
[1, 15].

Given the Euclidean space $\Lambda = \mathbb{R}^k$, let $V(\Lambda)$ be the space of
$\mathcal{C}^\infty$ vector fields on $\Lambda$ and $V_0(\Lambda) \subset V(\Lambda)$ the
subspace consisting of all vector fields with compact support. For $v \in V_0(\Lambda)$ and
any $x \in \Lambda$, the curve

$$t \mapsto \mathcal{V}^v_t(x) \in \Lambda$$

is defined as the solution of the following Cauchy problem:

$$\frac{d}{dt} \mathcal{V}^v_t(x) = v(\mathcal{V}^v_t(x)), \qquad \mathcal{V}^v_0(x) = x. \tag{2.7}$$

The associated flow $(\mathcal{V}^v_t, t \in \mathbb{R})$ induces a curve
$(\mathcal{V}^v_t)^* \omega = \omega \circ (\mathcal{V}^v_t)^{-1}$, $t \in \mathbb{R}$, on
$\Gamma_\Lambda$: if $\omega = \sum_{x \in \omega} \varepsilon_x$ then
$(\mathcal{V}^v_t)^* \omega = \sum_{x \in \omega} \varepsilon_{\mathcal{V}^v_t(x)}$. We are
then in position to define a notion of differentiability on $\Gamma_\Lambda$. We take
$Q^c(\omega, dx) = d\omega(x) = \sum_{y \in \omega} \varepsilon_y(dx)$
and $d\alpha^c(\omega, x) = d\omega(x) \, d\mu_\sigma(\omega)$. A measurable function
$F : \Gamma_\Lambda \to \mathbb{R}$ is said to be differentiable if for any
$v \in V_0(\Lambda)$, the following limit exists:

$$\lim_{t \to 0} \frac{F((\mathcal{V}^v_t)^* \omega) - F(\omega)}{t}.$$

We denote by $\nabla^c_v F(\omega)$ the preceding quantity. The domain of $\nabla^c$ is then
the set of integrable and differentiable functions such that there exists a process
$(\omega, x) \mapsto \nabla^c_x F(\omega)$ which belongs to $L^2(\alpha^c)$ and satisfies

$$\nabla^c_v F(\omega) = \int_\Lambda \nabla^c_x F(\omega) \cdot v(x) \, d\omega(x).$$

We denote by $\delta^c$ the adjoint operator of $\nabla^c$ in the sense of (2.3). Note that
the integration in the left-hand side of the duality formula (2.3) is made with respect
to a configuration $\omega$, whereas the intensity measure $\sigma$ is involved in the case
of the discrete gradient. Given the self-adjoint operator
$\mathcal{L}^c = \delta^c \nabla^c$, the associated Ornstein-Uhlenbeck semi-group
$(P^c_t)_{t \ge 0}$ is ergodic in $L^2(\mu_\sigma)$ with respect to the Poisson measure
$\mu_\sigma$, cf. Theorem 4.3 in [1]. However, in contrast to the case of the discrete
gradient, there is no known commutation relationship between the gradient $\nabla^c$ and
the semi-group $P^c_t$.

The distance we focus on in this part is the Wasserstein distance $\rho_2$. We have the
following lemma.

Lemma 2.5. The couple $(\nabla^c, \rho_2)$ satisfies the Rademacher property (2.4).

Proof. The proof is straightforward. Indeed, letting $F \in \rho_2\text{-Lip}_1$, we know
from Theorem 1.3 in [15] that $F \in \mathrm{Dom}\,\nabla^c$ and that

$$\sum_{x \in \omega} |\nabla^c_x F(\omega)|^2 = \int_\Lambda |\nabla^c_x F(\omega)|^2 \, d\omega(x) \le 1,
  \quad \mu_\sigma\text{-a.s.}$$

Hence we obtain $|\nabla^c_x F(\omega)| \le 1$, $\alpha^c$-a.e., in other words the
Rademacher property (2.4) is satisfied. □



3. Upper bounds on Rubinstein distances 



3.1. An abstract upper bound on Rubinstein distances. Let us first establish
an abstract upper bound on the Rubinstein distance by using a semi-group
method, provided the associated couple gradient/distance satisfies the Rademacher
property (2.4). Denote by $\rho$ a lower semi-continuous distance on the configuration
space $\Gamma_\Lambda$ and assume that Hypothesis 2.2 is fulfilled.

Proposition 3.1. Assume that the couple $(\nabla, \rho)$ satisfies the Rademacher property
(2.4). Let $L$ be the density of an absolutely continuous probability measure $\nu$ with
respect to $\mu_\sigma$. Then provided the inequality makes sense, the following upper bound
on the Rubinstein distance holds:

$$\mathcal{T}_\rho(\mu_\sigma, \nu) \le \int_{\Gamma_\Lambda} \int_\Lambda
  \left| \int_0^{+\infty} \nabla_x P_t L(\omega) \, dt \right| d\alpha(\omega, x). \tag{3.1}$$
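Proof. The proof follows the approach emphasized by Houdré and Privault in
[11] to derive covariance identities and then concentration inequalities. Letting
$F \in \rho\text{-Lip}_1$, we have by reversibility and using Fubini's Theorem:

$$\begin{aligned}
\int_{\Gamma_\Lambda} F \, d(\mu_\sigma - \nu)
&= \int_{\Gamma_\Lambda} \Big( \int_{\Gamma_\Lambda} F \, d\mu_\sigma - F \Big) L \, d\mu_\sigma \\
&= \int_{\Gamma_\Lambda} \Big( \int_0^{+\infty} \frac{d}{dt} P_t F \, dt \Big) L \, d\mu_\sigma \\
&= - \int_{\Gamma_\Lambda} \int_0^{+\infty} P_t \mathcal{L} F \, L \, dt \, d\mu_\sigma \\
&= - \int_{\Gamma_\Lambda} \int_0^{+\infty} \delta\nabla F \, P_t L \, dt \, d\mu_\sigma \\
&= - \int_{\Gamma_\Lambda} \int_\Lambda \nabla_x F \int_0^{+\infty} \nabla_x P_t L \, dt \, d\alpha(\cdot, x).
\end{aligned}$$

Using then the Rademacher property (2.4), the result holds by taking the supremum
over all functions $F \in \rho\text{-Lip}_1$. □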

Note that the upper bound in the inequality (3.1) is interesting in its own right,
but seems to be somewhat difficult to compute in full generality. Hence we turn
in the sequel to more concrete situations, i.e., when the gradient of interest is the
discrete gradient $\nabla^\sharp$ or the differential one $\nabla^c$ and is associated to
the convenient distance $\rho_i$, $i \in \{0, 1, 2\}$, in the sense of the Rademacher
property (2.4).

3.2. A qualitative upper bound on $\mathcal{T}_{\rho_1}$. Once the abstract estimate (3.1)
has been obtained, one notices that it might be simplified whenever a commutation
relation between gradient and semi-group holds. To the knowledge of the authors,
such a property is only verified in the case of the discrete gradient, so that we
focus in this part on the couple $(\nabla^\sharp, \rho_1)$. Here is one of the two main
results of the paper.

Theorem 3.2. Let $L$ be the density of an absolutely continuous probability measure
$\nu$ with respect to $\mu_\sigma$, and assume that $L \in \mathrm{Dom}\,\nabla^\sharp$ and
$\nabla^\sharp L \in L^1(\mu_\sigma \otimes \sigma)$. Then we get the following estimate:

$$\mathcal{T}_{\rho_1}(\mu_\sigma, \nu) \le
  \mathbb{E}_{\mu_\sigma}\Big[\int_\Lambda |\nabla^\sharp_x L| \, d\sigma(x)\Big]. \tag{3.2}$$

The same inequality also holds under the distance $\rho_0$.

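Proof. Since the case of a general intensity measure $\sigma \in \mathfrak{M}(\Lambda)$
might be established by a simple limiting procedure (use the $\sigma$-compactness of the
metric space $\Lambda$ and the lower semi-continuity of the application
$\sigma \mapsto \mathcal{T}_{\rho_1}(\mu_\sigma, \nu)$), let us assume that $\sigma$
is finite, so that the Rademacher property stated in Lemma 2.4 is satisfied by the
couple $(\nabla^\sharp, \rho_1)$. Hence Proposition 3.1 above entails the inequality

$$\mathcal{T}_{\rho_1}(\mu_\sigma, \nu) \le \mathbb{E}_{\mu_\sigma}\Big[\int_\Lambda
  \Big| \int_0^{+\infty} \nabla^\sharp_x P^\sharp_t L \, dt \Big| \, d\sigma(x)\Big].$$

Using now the commutation relation (2.6), we have

$$\mathcal{T}_{\rho_1}(\mu_\sigma, \nu) \le \mathbb{E}_{\mu_\sigma}\Big[\int_\Lambda
  \Big| \int_0^{+\infty} e^{-t} P^\sharp_t \nabla^\sharp_x L \, dt \Big| \, d\sigma(x)\Big] \tag{3.3}$$

$$\begin{aligned}
&\le \mathbb{E}_{\mu_\sigma}\Big[\int_\Lambda \int_0^{+\infty} e^{-t} P^\sharp_t |\nabla^\sharp_x L| \, dt \, d\sigma(x)\Big] \\
&= \mathbb{E}_{\mu_\sigma}\Big[\int_\Lambda \int_0^{+\infty} e^{-t} |\nabla^\sharp_x L| \, dt \, d\sigma(x)\Big] \\
&= \mathbb{E}_{\mu_\sigma}\Big[\int_\Lambda |\nabla^\sharp_x L| \, d\sigma(x)\Big],
\end{aligned}$$

where we have used Jensen's inequality and the invariance property of the Poisson
measure $\mu_\sigma$ with respect to the semi-group $P^\sharp_t$. The desired inequality
(3.2) is thus established.

Finally, the case of the trivial distance $\rho_0$ is similar since the couple
$(\nabla^\sharp, \rho_0)$ also satisfies the Rademacher property, cf. Lemma 2.4. The proof
is achieved in full generality. □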

Actually, the well-known relationship between semi-group and generator states
that for any $G \in L^2(\mu_\sigma)$,

$$\int_0^{+\infty} e^{-t} P^\sharp_t G \, dt = (\mathrm{Id} + \mathcal{L}^\sharp)^{-1} G.$$

Applying such an identity in the inequality (3.3) above gives the following bound:

$$\mathcal{T}_{\rho_1}(\mu_\sigma, \nu) \le \mathbb{E}_{\mu_\sigma}\Big[\int_\Lambda
  \big| (\mathrm{Id} + \mathcal{L}^\sharp)^{-1} \nabla^\sharp_x L \big| \, d\sigma(x)\Big]. \tag{3.4}$$

It seems theoretically slightly better than the upper bound of Theorem 3.2 but
often leads to intractable computations, except when the chaos representation of
$L$ is given, as noticed in Section 4.1 below. Note that the exact analogue of (3.4) on
Wiener space was proved in a different though related way in Theorem 3.2 of [9].

Let us provide another method leading to Theorem 3.2 which is based on the
so-called Clark formula. Instead of considering configurations in $\Gamma_\Lambda$, the
idea is to use multivariate Poisson processes, i.e., point processes on $[0, 1]$ with marks
in the $\sigma$-compact metric space $\Lambda$. Borrowing an idea of [19], we first explain
how to embed a Poisson process into a multivariate Poisson process.

Let $\hat\mu$ be the Poisson measure of intensity $\lambda \otimes \sigma$ on the new
configuration space $\Gamma_{\hat\Lambda}$, where the enlarged state space is
$\hat\Lambda = [0, 1] \times \Lambda$, and $\lambda$ denotes the Lebesgue measure on
$[0, 1]$. Any generic element $\hat\omega \in \Gamma_{\hat\Lambda}$ has the form
$\hat\omega = \sum_{(t, x) \in \hat\omega} \varepsilon_{t, x}$.
The canonical filtration is defined for any $t \in [0, 1]$ as

$$\mathcal{F}_t = \sigma\{\hat\omega([0, s] \times B), \ 0 \le s \le t, \ B \in \mathcal{B}(\Lambda)\}.$$

Let us recall the Clark formula, cf. for instance [7] or Lemma 1.3 in [19], which
states that every functional $G : \Gamma_{\hat\Lambda} \to \mathbb{R}$ belonging to
$\mathrm{Dom}\,\hat\nabla^\sharp$ might be written as

$$G = \mathbb{E}_{\hat\mu}[G] + \int_0^1 \int_\Lambda
  \mathbb{E}_{\hat\mu}\big[\hat\nabla^\sharp_{t, x} G \mid \mathcal{F}_{t^-}\big] \,
  d(\hat\omega - \lambda \otimes \sigma)(t, x), \tag{3.5}$$






LAURENT DECREUSEFOND, ALDERIC JOULIN, AND NICOLAS SAVY 



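where $\hat\nabla^\sharp$ denotes the discrete gradient on the enlarged configuration space
$\Gamma_{\hat\Lambda}$. For an element $\hat\omega \in \Gamma_{\hat\Lambda}$, we denote by
$\pi\hat\omega$ its projection on $\Gamma_\Lambda$, i.e.,

$$\pi\hat\omega(B) = \hat\omega([0, 1] \times B), \quad B \in \mathcal{B}(\Lambda),$$

and given $F : \Gamma_\Lambda \to \mathbb{R}$, we define the functional $\hat F$ as

$$\hat F : \Gamma_{\hat\Lambda} \to \mathbb{R}, \quad \hat\omega \mapsto F(\pi\hat\omega).$$

In particular, we clearly have
$\hat\nabla^\sharp_{t, x} \hat F(\hat\omega) = \nabla^\sharp_x F(\pi\hat\omega)$ for any
$(t, x) \in \hat\Lambda$. Moreover, we have
$\mathbb{E}_{\hat\mu}[\hat F] = \mathbb{E}_{\mu_\sigma}[F]$ since the image measure of
$\hat\mu$ by $\pi$ is $\mu_\sigma$.
The total variation distance on $\Gamma_{\hat\Lambda}$ is defined as

$$\hat\rho_1(\hat\omega, \hat\eta) = \sum_{(t, x) \in \hat\Lambda}
  |\hat\omega(\{(t, x)\}) - \hat\eta(\{(t, x)\})|.$$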

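The key point is the following lemma.

Lemma 3.3. For any $F \in \rho_1\text{-Lip}_1$, the functional $\hat F$ belongs to
$\hat\rho_1\text{-Lip}_1$.

Proof. Given $F \in \rho_1\text{-Lip}_1$, we have for any
$\hat\omega, \hat\eta \in \Gamma_{\hat\Lambda}$:

$$\begin{aligned}
|\hat F(\hat\omega) - \hat F(\hat\eta)| &= |F(\pi\hat\omega) - F(\pi\hat\eta)| \\
&\le \rho_1(\pi\hat\omega, \pi\hat\eta) \\
&= \sum_{x \in \Lambda} \Big| \sum_{t \in [0, 1]} \big( \hat\omega(\{(t, x)\}) - \hat\eta(\{(t, x)\}) \big) \Big| \\
&\le \sum_{(t, x) \in \hat\Lambda} |\hat\omega(\{(t, x)\}) - \hat\eta(\{(t, x)\})| \\
&= \hat\rho_1(\hat\omega, \hat\eta).
\end{aligned}$$

The proof is complete. □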
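
The inequality at the heart of Lemma 3.3 says that forgetting the time mark can only bring two configurations closer in total variation. A minimal sketch, assuming marked configurations are represented as lists of `(t, x)` pairs with made-up marks and labels (all names are hypothetical):

```python
from collections import Counter


def rho1(a, b):
    """Total variation distance between two finite configurations."""
    ca, cb = Counter(a), Counter(b)
    return sum(abs(ca[k] - cb[k]) for k in set(ca) | set(cb))


def project(marked):
    """Projection pi: forget the time mark t of each atom (t, x)."""
    return [x for _, x in marked]


# Hypothetical marked configurations on [0, 1] x Lambda.
omega_hat = [(0.2, "a"), (0.7, "b")]
eta_hat = [(0.5, "a"), (0.7, "b"), (0.9, "c")]

distance_marked = rho1(omega_hat, eta_hat)
distance_projected = rho1(project(omega_hat), project(eta_hat))
print(distance_projected, distance_marked)
```

In this example the atoms `(0.2, "a")` and `(0.5, "a")` differ on the enlarged space but coincide after projection, so the projected distance is strictly smaller than the distance between the marked configurations.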



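Now we are able to give a second proof of Theorem 3.2 by means of the Clark
formula (3.5) and Lemma 3.3.

Proof. Letting $\hat\nu$ be the measure with density $\hat L$ with respect to $\hat\mu$, we
obtain:

$$\begin{aligned}
\mathcal{T}_{\rho_1}(\mu_\sigma, \nu)
&= \sup_{F \in \rho_1\text{-Lip}_1} \mathbb{E}_{\mu_\sigma}[F (L - 1)] \\
&= \sup_{F \in \rho_1\text{-Lip}_1} \mathbb{E}_{\hat\mu}[\hat F (\hat L - 1)] \\
&= \sup_{F \in \rho_1\text{-Lip}_1} \mathbb{E}_{\hat\nu}[\hat F] - \mathbb{E}_{\hat\mu}[\hat F].
\end{aligned}$$

Now using the Clark formula (3.5) and taking expectation with respect to $\hat\nu$,

$$\begin{aligned}
\mathbb{E}_{\hat\nu}[\hat F]
&= \mathbb{E}_{\hat\mu}[\hat F] + \mathbb{E}_{\hat\nu}\Big[\int_0^1 \int_\Lambda
   \mathbb{E}_{\hat\mu}\big[\hat\nabla^\sharp_{t, x} \hat F \mid \mathcal{F}_{t^-}\big] \,
   d(\hat\omega - \lambda \otimes \sigma)(t, x)\Big] \\
&= \mathbb{E}_{\hat\mu}[\hat F] + \mathbb{E}_{\hat\mu}\Big[\hat L \int_0^1 \int_\Lambda
   \mathbb{E}_{\hat\mu}\big[\hat\nabla^\sharp_{t, x} \hat F \mid \mathcal{F}_{t^-}\big] \,
   d(\hat\omega - \lambda \otimes \sigma)(t, x)\Big] \\
&= \mathbb{E}_{\hat\mu}[\hat F] + \mathbb{E}_{\hat\mu}\Big[\int_0^1 \int_\Lambda
   \mathbb{E}_{\hat\mu}\big[\hat\nabla^\sharp_{t, x} \hat F \mid \mathcal{F}_{t^-}\big] \,
   \mathbb{E}_{\hat\mu}\big[\hat\nabla^\sharp_{t, x} \hat L \mid \mathcal{F}_{t^-}\big] \,
   dt \, d\sigma(x)\Big],
\end{aligned}$$

where in the last line we also used the Clark formula (3.5) applied to the
functional $\hat L$. By Lemma 2.4, the couple $(\hat\nabla^\sharp, \hat\rho_1)$ satisfies
the Rademacher property (2.4) on $\Gamma_{\hat\Lambda}$. Hence Lemma 3.3 implies that for
$F \in \rho_1\text{-Lip}_1$, the quantity
$\mathbb{E}_{\hat\mu}[\hat\nabla^\sharp_{t, x} \hat F \mid \mathcal{F}_{t^-}]$
is bounded by 1, $\hat\mu \otimes \lambda \otimes \sigma$-a.e., so that, using also Jensen's
inequality for conditional expectations, we obtain finally

$$\mathcal{T}_{\rho_1}(\mu_\sigma, \nu) \le \mathbb{E}_{\hat\mu}\Big[\int_0^1 \int_\Lambda
  |\hat\nabla^\sharp_{t, x} \hat L| \, dt \, d\sigma(x)\Big]
  = \mathbb{E}_{\mu_\sigma}\Big[\int_\Lambda |\nabla^\sharp_x L| \, d\sigma(x)\Big].$$

The second proof of Theorem 3.2 is thus complete. □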



3.3. A qualitative upper bound on $\mathcal{T}_{\rho_2}$ by time-change. Recall that by
Lemma 2.5, the couple $(\nabla^c, \rho_2)$ satisfies the Rademacher property (2.4). Hence
Proposition 3.1 entails an upper bound on the $\mathcal{T}_{\rho_2}$ Rubinstein distance as
follows: if $L$ denotes the density of an absolutely continuous probability measure $\nu$
with respect to $\mu_\sigma$, then we have

$$\mathcal{T}_{\rho_2}(\mu_\sigma, \nu) \le \int_{\Gamma_\Lambda} \int_\Lambda
  \left| \int_0^{+\infty} \nabla^c_x P^c_t L(\omega) \, dt \right| d\omega(x) \, d\mu_\sigma(\omega),$$

provided the inequality makes sense. However, despite its theoretical interest, such
an inequality is not really tractable in practice, since no commutation relation has
been established yet between the differential gradient $\nabla^c$ and the semi-group
$P^c_t$. Hence the purpose of this section is to provide another estimate on
$\mathcal{T}_{\rho_2}$ through a different approach relying on a time-change argument
together with the Girsanov Theorem.

We consider the notation of Section 3.2 above, with the difference that the state
space is now $\hat\Lambda := [0, \infty) \times \Lambda$, where $\Lambda$ is the space
$\mathbb{R}^k$ equipped with the Euclidean distance $\kappa$. In this part, the distance of
interest on the enlarged configuration space $\Gamma_{\hat\Lambda}$ is the Wasserstein
distance:

$$\hat\rho_2(\hat\omega, \hat\eta)^2 = \inf_{\beta \in \Sigma(\hat\omega, \hat\eta)}
  \int_{\hat\Lambda} \int_{\hat\Lambda} \big( \kappa(x, y)^2 + |t - s|^2 \big) \, d\beta((s, x), (t, y)).$$

The following theorem is our second main result.

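Theorem 3.4. Let $\hat L$ be the (positive) density of an absolutely continuous probability
measure $\hat\nu$ with respect to $\hat\mu$. Then provided the inequality makes sense, we
get the following upper bound on the Rubinstein distance
$\mathcal{T}_{\hat\rho_2}(\hat\mu, \hat\nu)$:

$$\begin{aligned}
\mathcal{T}_{\hat\rho_2}(\hat\mu, \hat\nu)^2
&\le \mathbb{E}_{\hat\mu}\Big[\hat L \int_\Lambda \int_0^{+\infty}
   \Big| \int_0^t u(s, z) \, ds \Big|^2 (1 + u(t, z)) \, dt \, d\sigma(z)\Big] \\
&= \mathbb{E}_{\hat\mu}\Big[\hat L \int_\Lambda \int_0^{+\infty}
   |r - v^{-1}(r, z)|^2 \, dr \, d\sigma(z)\Big],
\end{aligned} \tag{3.6}$$

where $u(t, z) > -1$ is the following predictable process:

$$u(t, z) = \frac{\mathbb{E}_{\hat\mu}\big[\hat\nabla^\sharp_{t, z} \hat L \mid \mathcal{F}_{t^-}\big]}
  {\mathbb{E}_{\hat\mu}\big[\hat L \mid \mathcal{F}_{t^-}\big]},$$

$$v(t, z) := t + \int_0^t u(s, z) \, ds, \quad z \in \Lambda,$$

and $v^{-1}(\cdot, z)$ is the inverse of the increasing mapping $t \mapsto v(t, z)$.

Note that for $z \in \Lambda$ fixed, the term
$\int_0^{+\infty} |r - v^{-1}(r, z)|^2 \, dr$ can be interpreted
as a generalized Wasserstein distance between the infinite measures $dr$ and
$(1 + u(r, z)) \, dr$, see [18]. Then, the $\mathcal{T}_{\hat\rho_2}$ distance is bounded
from above by the expectation under $\hat\nu$ of this generalized distance integrated over
$\Lambda$ according to the marks distribution.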

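Proof. By the Girsanov Theorem, there exists a predictable process $u$ such that
for any compact set $K \in \mathcal{B}(\Lambda)$, the process

$$t \mapsto \hat\omega([0, t] \times K) - \int_0^t \int_K (1 + u(s, z)) \, ds \, d\sigma(z)$$

is a $\hat\nu$-martingale. Moreover, the conditional expectation
$L_t := \mathbb{E}_{\hat\mu}[\hat L \mid \mathcal{F}_t]$ might be identified as follows:

$$\begin{aligned}
L_t &= \exp\Big( \int_0^t \int_\Lambda \ln(1 + u(s, z)) \, d\hat\omega(s, z)
   - \int_0^t \int_\Lambda u(s, z) \, ds \, d\sigma(z) \Big) \\
&= \mathcal{E}\Big( \int_0^{\cdot} \int_\Lambda u(s, z) \, d(\hat\omega - \lambda \otimes \sigma)(s, z) \Big)_t \\
&= 1 + \int_0^t \int_\Lambda L_{s^-} \, u(s, z) \, d(\hat\omega - \lambda \otimes \sigma)(s, z),
\end{aligned}$$

where $\mathcal{E}$ denotes the classical Doléans-Dade exponential. On the other hand, the
Clark formula (3.5) extended to the set $(0, +\infty)$ induces that

$$L_t = 1 + \int_0^t \int_\Lambda
  \mathbb{E}_{\hat\mu}\big[\hat\nabla^\sharp_{s, z} L_t \mid \mathcal{F}_{s^-}\big] \,
  d(\hat\omega - \lambda \otimes \sigma)(s, z).$$

By identification, we obtain:

$$u(s, z) = \frac{\mathbb{E}_{\hat\mu}\big[\hat\nabla^\sharp_{s, z} L_t \mid \mathcal{F}_{s^-}\big]}{L_{s^-}}
  = \frac{\mathbb{E}_{\hat\mu}\big[\hat\nabla^\sharp_{s, z} \hat L \mid \mathcal{F}_{s^-}\big]}{L_{s^-}},$$

since for any $s \in (0, t)$ a commutation relation holds between the discrete gradient
$\hat\nabla^\sharp_{s, z}$ and the conditional expectation knowing $\mathcal{F}_t$, cf. for
instance Lemma 3.2 in [13]. Define on $\Gamma_{\hat\Lambda}$ the time-change configuration
$\tau\hat\omega$ by

$$\tau\hat\omega = \sum_{(t_i, z_i) \in \hat\omega} \varepsilon_{v(t_i, z_i), z_i},$$

where $v(t, z)$ is given above. By Theorem 3 in [5], the distribution of $\tau\hat\omega$
under $\hat\nu$ is nothing but the law of the configuration $\hat\omega$ under $\hat\mu$.
Hence using the Cauchy-Schwarz inequality in the second line below, we obtain:

$$\begin{aligned}
\mathcal{T}_{\hat\rho_2}(\hat\mu, \hat\nu)
&\le \mathbb{E}_{\hat\nu}\big[\hat\rho_2(\hat\omega, \tau\hat\omega)\big] \\
&\le \mathbb{E}_{\hat\nu}\Big[\int_\Lambda \int_0^{+\infty} |t - v(t, z)|^2 \, d\hat\omega(t, z)\Big]^{1/2} \\
&= \mathbb{E}_{\hat\nu}\Big[\int_\Lambda \int_0^{+\infty} |t - v(t, z)|^2 (1 + u(t, z)) \, dt \, d\sigma(z)\Big]^{1/2},
\end{aligned}$$

where we used the classical compensation formula for stochastic integrals with
respect to Poisson random measures. Finally, the change of variable $r = v(t, z)$
for $z \in \Lambda$ being fixed allows us to obtain the desired inequality (3.6). □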

4. Applications 

4.1. Distance estimates between processes. The purpose of the present part
is to apply our main results, Theorems 3.2 and 3.4, to provide distance estimates
between a Poisson process and several other more sophisticated processes, such as
Cox or Gibbs processes. See for instance the pioneering monograph [3] or also [2, 17]
for similar results with respect to other (bounded) distances on the configuration
space $\Gamma_\Lambda$. The first three examples below rely on the total variation distance
$\rho_1$, whereas in the last one the Wasserstein distance $\rho_2$ is considered.
Poisson processes. Here the probability measure $\nu$ is supposed to be another
Poisson measure on $\Gamma_\Lambda$, where $\Lambda$ is a $\sigma$-compact metric space.

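Proposition 4.1. Let $\mu_\tau$ be a Poisson measure on $\Gamma_\Lambda$ of intensity $\tau$. We assume that $\tau$ admits a density $p$ with respect to $\sigma$ such that $p - 1 \in L^1(\sigma)$. Then we have

$$
T_{\rho_1}(\mu_\sigma,\mu_\tau) \le \int_\Lambda |p(x)-1|\, d\sigma(x). \qquad (4.1)
$$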

Proof. Since $\mu_\tau$ is a Poisson measure on $\Gamma_\Lambda$ of intensity $\tau$, it is well known that it is absolutely continuous with respect to $\mu_\sigma$ and the density $L$ is given by

$$
L(\omega) = \exp\left\{ \int_\Lambda \log p(x)\, d\omega(x) + \int_\Lambda (1 - p(x))\, d\sigma(x) \right\}.
$$

It is then straightforward that $\nabla^\sharp_x L = L\,(p(x) - 1)$, hence by Theorem 3.2,

$$
T_{\rho_1}(\mu_\sigma,\mu_\tau) \le E_{\mu_\sigma}\left[ L \int_\Lambda |p(x) - 1|\, d\sigma(x) \right] = \int_\Lambda |p(x) - 1|\, d\sigma(x).
$$

The proof is achieved. □
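The gradient identity used in this proof can be checked numerically on a finite configuration. The sketch below assumes that the discrete gradient acts by adding a point, $\nabla^\sharp_x F(\omega) = F(\omega + \delta_x) - F(\omega)$; the interval $[0,1]$, the density `p` and the configuration `omega` are hypothetical choices made only for illustration.

```python
import math

def density(points, p, integral_one_minus_p):
    """L(omega) = exp( sum_i log p(x_i) + int (1 - p) dsigma )."""
    return math.exp(sum(math.log(p(x)) for x in points) + integral_one_minus_p)

def discrete_gradient(points, x, p, integral_one_minus_p):
    """L(omega + delta_x) - L(omega): effect of adding the point x (assumed definition)."""
    return (density(points + [x], p, integral_one_minus_p)
            - density(points, p, integral_one_minus_p))

# Hypothetical example on [0, 1] with sigma the Lebesgue measure.
p = lambda x: 1.0 + 0.5 * x      # density of tau with respect to sigma
c = -0.25                        # int_0^1 (1 - p(x)) dx = -1/4
omega = [0.1, 0.4, 0.85]         # a finite configuration
x = 0.6                          # point added to the configuration

lhs = discrete_gradient(omega, x, p, c)
rhs = density(omega, p, c) * (p(x) - 1.0)
print(lhs, rhs)
```

Both printed values agree up to rounding, since adding the point $x$ multiplies $L$ by $p(x)$.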



Note that in this very simple situation, the inequality (3.4) yields the same bound. Indeed, since $p$ is deterministic, the density $L$ has the following chaos representation

$$
L = 1 + \sum_{n=1}^{+\infty} \frac{1}{n!}\, J_n\left((p-1)^{\otimes n}\right),
$$

cf. identity (7) in [16], so that we have



((id+£«)-Mi = (pw-i; 



E 



^ (n-l)! 



J„_i((p-l)^"-i) =(p(x)-l)L. 



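Actually, one might obtain the inequality (4.1) by using another very intuitive approach. Indeed, let $\omega_0$, $\omega_1$ and $\omega_2$ be three independent configurations in $\Gamma_\Lambda$ with respective intensities

$$
d\sigma_0 := (p \wedge 1)\, d\sigma, \qquad \sigma_1 := \sigma - \sigma_0, \qquad \sigma_2 := \tau - \sigma_0.
$$

Then $\omega_0 + \omega_1$ and $\omega_0 + \omega_2$ have respective distributions $\mu_\sigma$ and $\mu_\tau$. Hence we have

$$
\begin{aligned}
T_{\rho_1}(\mu_\sigma,\mu_\tau) &= \inf\left\{ E[\rho_1(\omega,\tilde\omega)] : \omega \sim \mu_\sigma,\ \tilde\omega \sim \mu_\tau \right\} \\
&\le E[\rho_1(\omega_0 + \omega_1, \omega_0 + \omega_2)] \\
&= E[(\omega_1 + \omega_2)(\Lambda)] \\
&= \int_\Lambda |p(x) - 1|\, d\sigma(x).
\end{aligned}
$$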
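The coupling argument can be illustrated numerically. In the sketch below, $\Lambda = [0,1]$, $\sigma$ is the Lebesgue measure and `p` is a hypothetical density; none of these choices come from the text. The code computes $\sigma_1(\Lambda) + \sigma_2(\Lambda)$ by quadrature, compares it with $\int_\Lambda |p - 1|\, d\sigma$, and estimates the coupling cost $E[(\omega_1+\omega_2)(\Lambda)]$ by simulating the two independent Poisson counts.

```python
import math
import random

def p(x):
    """Hypothetical density of tau with respect to sigma = Lebesgue measure on [0, 1]."""
    return 1.0 + 0.5 * math.sin(2.0 * math.pi * x)

def midpoint(f, n=20000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))

def poisson(mean, rng):
    """Knuth's multiplication method, adequate for small means."""
    threshold, k, prod = math.exp(-mean), 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

mass_1 = midpoint(lambda x: 1.0 - min(p(x), 1.0))    # sigma_1(Lambda)
mass_2 = midpoint(lambda x: p(x) - min(p(x), 1.0))   # sigma_2(Lambda)
bound = midpoint(lambda x: abs(p(x) - 1.0))          # right-hand side of (4.1)

rng = random.Random(0)
samples = 20000
# Cost of the coupling: number of points of omega_1 plus number of points of omega_2.
mean_cost = sum(poisson(mass_1, rng) + poisson(mass_2, rng)
                for _ in range(samples)) / samples
print(mass_1 + mass_2, bound, mean_cost)
```

For this choice of `p` the bound equals $1/\pi$, and the simulated mean cost fluctuates around that value.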

Cox processes. A Cox process is a Poisson process with a random intensity. To construct a Cox process, we need to enlarge our probability space. Recall that $\mathfrak{M}(\Lambda)$ is the space of positive and diffuse Radon measures on $\Lambda$ endowed with the vague topology and the corresponding Borel $\sigma$-field. Given an arbitrary probability measure $P_M$ on $\mathfrak{M}(\Lambda)$, we denote by $M$ the canonical random variable on $(\mathfrak{M}(\Lambda), P_M)$, i.e. $M$ given by $M(m) = m$ has distribution $P_M$. On the space $\Gamma_\Lambda \times \mathfrak{M}(\Lambda)$, we consider the probability measures

$$
d\mu'_M(\omega,m) := d\mu_m(\omega)\, dP_M(m) \quad \text{and} \quad d\mu'_\sigma(\omega,m) := d\mu_\sigma(\omega)\, dP_M(m).
$$

Note that the second one is the distribution of the independent couple $(N,M)$, where $N$ is the canonical random variable on $\Gamma_\Lambda$ with distribution $\mu_\sigma$. As noticed in Section 2.1, the map $m \mapsto T_{\rho_1}(\mu_m,\mu_\sigma)$ is lower semi-continuous, hence measurable. The distribution on $\Gamma_\Lambda$ is said to be Cox whenever for any function $f \in C_0(\Lambda)$,

$$
E\left[ \exp\left( \int_\Lambda f\, d\omega \right) \,\Big|\, M \right] = \exp\left\{ \int_\Lambda (e^f - 1)\, dM \right\}.
$$

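In the definition of the distance between $\mu'_\sigma$ and $\mu'_M$, we do not include any information on $M$, so that the distance $\rho_1$ remains the same and we have:

$$
\begin{aligned}
T_{\rho_1}(\mu'_\sigma,\mu'_M) &= \sup_{F \in \rho_1\text{-Lip}_1} \left( \int_{\Gamma_\Lambda \times \mathfrak{M}(\Lambda)} F(\omega)\, d\mu'_\sigma(\omega,m) - \int_{\Gamma_\Lambda \times \mathfrak{M}(\Lambda)} F(\omega)\, d\mu'_M(\omega,m) \right) \\
&= \sup_{F \in \rho_1\text{-Lip}_1} \int_{\mathfrak{M}(\Lambda)} \left( \int_{\Gamma_\Lambda} F(\omega)\, d(\mu_\sigma - \mu_m)(\omega) \right) dP_M(m).
\end{aligned}
$$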

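Proposition 4.2. Assume that, $\mu'_\sigma$-a.s., the measure $M$ is absolutely continuous with respect to $\sigma$ and that there exists a measurable version of $dM/d\sigma$ such that $dM/d\sigma - 1 \in L^1(\mu'_\sigma \otimes \sigma)$. Then we have

$$
T_{\rho_1}(\mu'_\sigma,\mu'_M) \le E\left[ \int_\Lambda \left| \frac{dM}{d\sigma}(x) - 1 \right| d\sigma(x) \right].
$$

Proof. We have:

$$
\begin{aligned}
T_{\rho_1}(\mu'_\sigma,\mu'_M) &\le \int_{\mathfrak{M}(\Lambda)} \sup_{F \in \rho_1\text{-Lip}_1} \left( \int_{\Gamma_\Lambda} F(\omega)\, d(\mu_\sigma - \mu_m)(\omega) \right) dP_M(m) \\
&\le \int_{\mathfrak{M}(\Lambda)} \int_\Lambda \left| \frac{dm}{d\sigma}(x) - 1 \right| d\sigma(x)\, dP_M(m),
\end{aligned}
$$

where the last inequality follows from Proposition 4.1. □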



Gibbs processes. Let $\Lambda = \mathbb{R}^d$ and assume that the measure $\nu$ is a Gibbs measure on $\Gamma_\Lambda$ with respect to the reference measure $\mu_\sigma$, i.e. the density of $\nu$ with respect to $\mu_\sigma$ is of the form $L = e^{-V}$, where

$$
V(\omega) := \int_\Lambda \int_\Lambda \phi(x - y)\, d\omega(x)\, d\omega(y) < +\infty, \quad \mu_\sigma\text{-a.s.},
$$

and where the potential $\phi : \Lambda \to [0,+\infty)$ is such that $\phi(x) = \phi(-x)$ and

$$
\int_\Lambda \int_\Lambda \phi(x - y)\, d\sigma(x)\, d\sigma(y) < +\infty.
$$

We have the following result. 

Proposition 4.3. The Rubinstein distance $T_{\rho_1}$ between the Poisson measure $\mu_\sigma$ and the Gibbs measure $\nu$ is bounded as follows:

$$
T_{\rho_1}(\mu_\sigma,\nu) \le 2 \int_\Lambda \int_\Lambda \phi(x - y)\, d\sigma(x)\, d\sigma(y).
$$



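Proof. Since $V$ is $\mu_\sigma$-a.s. finite, so is $\int_\Lambda \phi(x - y)\, d\omega(y)$ for any $x$. We have:

$$
\nabla^\sharp_x L(\omega) = -L(\omega) \left( 1 - \exp\left\{ -2 \int_\Lambda \phi(x - y)\, d\omega(y) \right\} \right), \quad x \in \Lambda.
$$

Since $0 < L \le 1$, Theorem 3.2 together with the inequality $1 - e^{-u} \le u$ imply:

$$
\begin{aligned}
T_{\rho_1}(\mu_\sigma,\nu) &\le E_{\mu_\sigma}\left[ L \int_\Lambda \left( 1 - \exp\left\{ -2 \int_\Lambda \phi(x - y)\, d\omega(y) \right\} \right) d\sigma(x) \right] \\
&\le E_{\mu_\sigma}\left[ L \int_\Lambda 2 \int_\Lambda \phi(x - y)\, d\omega(y)\, d\sigma(x) \right] \\
&\le 2\, E_{\mu_\sigma}\left[ \int_\Lambda \int_\Lambda \phi(x - y)\, d\omega(y)\, d\sigma(x) \right] \\
&= 2 \int_\Lambda \int_\Lambda \phi(x - y)\, d\sigma(x)\, d\sigma(y).
\end{aligned}
$$

The proof is complete. □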
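The gradient formula for the Gibbs density can be checked on a small example. The sketch below makes two assumptions that are not stated in the text: the discrete gradient acts by adding a point, and the double integral defining $V$ runs over ordered pairs of distinct points, so that no diagonal term $\phi(0)$ appears. The potential `phi` and the configuration are hypothetical, and the example is one-dimensional.

```python
import math

def phi(d):
    """Hypothetical even, nonnegative pair potential."""
    return math.exp(-d * d)

def energy(points):
    """V(omega): sum of phi(x - y) over ordered pairs of distinct points (assumption)."""
    return sum(phi(x - y)
               for i, x in enumerate(points)
               for j, y in enumerate(points) if i != j)

def density(points):
    """L = exp(-V)."""
    return math.exp(-energy(points))

omega = [0.0, 0.7, 1.5, 2.2]     # a finite configuration on the real line
x = 1.0                          # point added to the configuration
interaction = sum(phi(x - y) for y in omega)

gradient = density(omega + [x]) - density(omega)
closed_form = -density(omega) * (1.0 - math.exp(-2.0 * interaction))
print(gradient, closed_form)
```

The inequality $1 - e^{-u} \le u$ then gives $|\nabla^\sharp_x L| \le 2 L \int_\Lambda \phi(x-y)\, d\omega(y)$, which is the step used in the proof.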
Poisson processes on the half-line. In this example, we give a bound on the Rubinstein distance between Poisson processes, with respect to the Wasserstein distance $\rho_2$. For simplicity, consider Poisson processes on $\mathbb{R}_+$ (the generalization to multivariate Poisson processes is straightforward). Letting $U : \mathbb{R}_+ \to \mathbb{R}$ be a continuously differentiable function vanishing at infinity and with $U(0) = 0$, we also assume that $U \in L^2(\lambda)$, where $\lambda$ is the Lebesgue measure, and that its derivative $U'$ takes values in $(-1,+\infty)$. A typical example of such a function is $U(t) = t/(1 + t^2)$, $t \ge 0$. Then we obtain by Theorem 3.4 the following result.

Proposition 4.4. Let $\mu_\lambda$ be the Poisson measure of Lebesgue intensity $\lambda$ on the configuration space $\Gamma_{\mathbb{R}_+}$, and consider the Poisson measure $\nu$ of intensity $(1 + U')\, d\lambda$. Then we have the upper bound on $T_{\rho_2}(\mu_\lambda,\nu)$:

$$
T_{\rho_2}(\mu_\lambda,\nu) \le \|U\|_{L^2(\lambda)}.
$$
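One way to see where the $L^2$ norm comes from, assuming the bound of Theorem 3.4 takes the form $\int_0^{+\infty} |t - v(t)|^2 v'(t)\, dt$ with $v(t) = t + U(t)$, is that $\int_0^{+\infty} U^2 U'\, dt = [U^3/3]_0^{+\infty} = 0$. The sketch below checks this numerically for the typical example $U(t) = t/(1+t^2)$; the truncation level `T` and the step count `n` are choices of the sketch.

```python
import math

def U(t):
    return t / (1.0 + t * t)

def U_prime(t):
    return (1.0 - t * t) / (1.0 + t * t) ** 2

def midpoint(f, a, b, n):
    """Midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

T = 1000.0      # truncation of the half-line (assumption of this sketch)
n = 200000

# Integral of |t - v(t)|^2 v'(t) with v(t) = t + U(t), i.e. U^2 (1 + U').
weighted = midpoint(lambda t: U(t) ** 2 * (1.0 + U_prime(t)), 0.0, T, n)
# Squared L^2 norm of U, truncated at T.
plain = midpoint(lambda t: U(t) ** 2, 0.0, T, n)
# U' must stay above -1 for (1 + U') to be an intensity.
min_derivative = min(U_prime(0.01 * k) for k in range(100001))
print(weighted, plain, math.sqrt(plain), min_derivative)
```

For this $U$ the exact value is $\|U\|_{L^2(\lambda)}^2 = \pi/4$, so the bound reads $T_{\rho_2}(\mu_\lambda,\nu) \le \sqrt{\pi}/2$; the truncated integral falls short of $\pi/4$ by about $1/T$.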

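4.2. Tail and isoperimetric estimates. The aim of this final part is to derive several consequences of Theorem 3.2 above in terms of tail estimates and isoperimetric inequalities.

Tail estimates. Our main result Theorem 3.2 allows us to obtain a first tail estimate as follows. Let $F \in \rho_1\text{-Lip}_1$ be centered and let $\lambda > 0$. Denote $Z_\lambda = E_{\mu_\sigma}[e^{\lambda F}]$ and consider the absolutely continuous probability measure $\nu_\lambda$ with density $e^{\lambda F}/Z_\lambda$ with respect to $\mu_\sigma$. Using a somewhat similar argument as in [11], we have:

$$
\begin{aligned}
\frac{d}{d\lambda} \log Z_\lambda &= \int F\, d\nu_\lambda \\
&\le \frac{1}{Z_\lambda}\, E_{\mu_\sigma}\left[ \int_\Lambda |\nabla^\sharp_x e^{\lambda F}|\, d\sigma(x) \right] \\
&\le (e^\lambda - 1)\, \|\nabla^\sharp F\|_{1,\infty},
\end{aligned}
$$

where in the last inequality we used the fact that the function $x \mapsto (e^x - 1)/x$ is non-decreasing on $(0,+\infty)$. Here the notation $\|\nabla^\sharp F\|_{1,\infty}$ stands for

$$
\|\nabla^\sharp F\|_{1,\infty} := \mu_\sigma\text{-}\operatorname{ess\,sup} \int_\Lambda |\nabla^\sharp_x F|\, d\sigma(x).
$$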

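Hence we obtain the following bound on the Laplace transform:

$$
E_{\mu_\sigma}\left[ e^{\lambda F} \right] = Z_\lambda \le \exp\left\{ \|\nabla^\sharp F\|_{1,\infty}\, (e^\lambda - \lambda - 1) \right\}, \quad \lambda > 0.
$$

Finally, using Chebyshev's inequality, we get the deviation inequality, valid for any $r > 0$:

$$
\mu_\sigma(F \ge r) \le \exp\left\{ r - \left( r + \|\nabla^\sharp F\|_{1,\infty} \right) \log\left( 1 + \frac{r}{\|\nabla^\sharp F\|_{1,\infty}} \right) \right\}. \qquad (4.2)
$$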
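The passage from the Laplace bound to (4.2) is an optimization over $\lambda$: one minimizes $-\lambda r + c\,(e^\lambda - \lambda - 1)$, where $c$ stands for $\|\nabla^\sharp F\|_{1,\infty}$, and the minimum is attained at $\lambda = \log(1 + r/c)$. The following sketch compares a grid minimum with the closed form; the values of `r` and `c` are arbitrary.

```python
import math

def chernoff_exponent(lam, r, c):
    """Exponent -lam*r + c*(e^lam - lam - 1) given by Markov's inequality and the Laplace bound."""
    return -lam * r + c * (math.exp(lam) - lam - 1.0)

def closed_form(r, c):
    """Exponent on the right-hand side of (4.2), with c standing for the gradient norm."""
    return r - (r + c) * math.log(1.0 + r / c)

r, c = 2.0, 1.0
# Brute-force minimum over lambda in (0, 5] with step 1e-4.
grid_min = min(chernoff_exponent(0.0001 * k, r, c) for k in range(1, 50001))
optimal_lam = math.log(1.0 + r / c)
print(grid_min, closed_form(r, c), optimal_lam)
```

The grid minimum matches the closed form up to the grid resolution, which confirms the choice of the optimal $\lambda$.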

Note that such a tail estimate is somewhat similar to that established for instance by Wu and Houdré-Privault in [19, 11]. However, in contrast to their results, we do not exhibit at the denominator the sharp variance term

$$
\|\nabla^\sharp F\|_{2,\infty}^2 := \mu_\sigma\text{-}\operatorname{ess\,sup} \int_\Lambda |\nabla^\sharp_x F|^2\, d\sigma(x),
$$



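since our method relies on the $L^1$-inequality (3.2). In particular, if we apply (4.2) for instance to the centered function $F \in \rho_1\text{-Lip}_1$ given by $F(\omega) = (\omega - \sigma)(K)$, where $K$ is some compact subset of $\Lambda$, we obtain the inequality

$$
\mu_\sigma(\omega(K) \ge \sigma(K) + r) \le e^{\, r - (r + \sigma(K)) \log\left( 1 + r/\sigma(K) \right)}.
$$

Unfortunately, neither (4.2) nor the results emphasized in [19, 11] are sharp in terms of the deviation level $r$ since the following asymptotic estimate holds, cf. for instance p. 1225 of Houdré [10]:

$$
\mu_\sigma(\omega(K) \ge \sigma(K) + r) = \mu_\sigma(\omega(K) \ge \lceil \sigma(K) + r \rceil) \underset{r \to +\infty}{\sim} \frac{e^{\lceil \sigma(K)+r \rceil - \sigma(K) - \lceil \sigma(K)+r \rceil \log\left( \lceil \sigma(K)+r \rceil / \sigma(K) \right)}}{\sqrt{2\pi \lceil \sigma(K)+r \rceil}},
$$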

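where $\lceil R \rceil := \inf\{N \in \mathbb{N}^* : N \ge R\}$ denotes the upper integer part of any positive real number $R$. Hence the purpose of this part is to recover this multiplicative polynomial factor by means of a simple use of Theorem 3.2. We proceed as follows. Let $\nu$ be the absolutely continuous probability measure with density $L$ with respect to $\mu_\sigma$, where

$$
L := \frac{1_{\{\omega(K) \ge \lceil \sigma(K)+r \rceil\}}}{\mu_\sigma(\omega(K) \ge \lceil \sigma(K)+r \rceil)}.
$$

Using Theorem 3.2, we compute as follows:

$$
\begin{aligned}
\mu_\sigma(\omega(K) \ge \sigma(K)+r) &= \mu_\sigma(\omega(K) \ge \lceil \sigma(K)+r \rceil) \\
&\le \frac{1}{\lceil \sigma(K)+r \rceil}\, E_{\mu_\sigma}\left[ \omega(K)\, 1_{\{\omega(K) \ge \lceil \sigma(K)+r \rceil\}} \right] \\
&= \frac{1}{\lceil \sigma(K)+r \rceil}\, E_{\mu_\sigma}\left[ \left( \omega(K) - \sigma(K) + \sigma(K) \right) 1_{\{\omega(K) \ge \lceil \sigma(K)+r \rceil\}} \right] \\
&\le \frac{1}{\lceil \sigma(K)+r \rceil} \left( E_{\mu_\sigma}\left[ \int_\Lambda |\nabla^\sharp_x L|\, d\sigma(x) \right] + \sigma(K) \right) \mu_\sigma(\omega(K) \ge \lceil \sigma(K)+r \rceil) \\
&= \frac{\sigma(K)}{\lceil \sigma(K)+r \rceil} \Big( \mu_\sigma(\omega(K) = \lceil \sigma(K)+r \rceil - 1) + \mu_\sigma(\omega(K) \ge \lceil \sigma(K)+r \rceil) \Big),
\end{aligned}
$$

so that we obtain

$$
\mu_\sigma(\omega(K) \ge \sigma(K)+r) \le \frac{\lceil \sigma(K)+r \rceil}{r}\, e^{-\sigma(K)}\, \frac{\sigma(K)^{\lceil \sigma(K)+r \rceil}}{\lceil \sigma(K)+r \rceil !}.
$$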

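Hence using the lower bound below on the factorial function of any positive integer $N$, cf. for instance [8]:

$$
\sqrt{2\pi}\, N^{N+1/2} e^{-N} \le N! \le \sqrt{2\pi}\, N^{N+1/2} e^{-N + \frac{1}{12N}}, \qquad (4.3)
$$

we obtain the following result.

Proposition 4.5. Given any compact set $K \subset \Lambda$ and any $r > 0$, we have the tail estimate:

$$
\mu_\sigma(\omega(K) \ge \sigma(K) + r) \le \frac{\lceil \sigma(K)+r \rceil}{r} \cdot \frac{e^{\lceil \sigma(K)+r \rceil - \sigma(K) - \lceil \sigma(K)+r \rceil \log\left( \lceil \sigma(K)+r \rceil / \sigma(K) \right)}}{\sqrt{2\pi \lceil \sigma(K)+r \rceil}}.
$$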



To the knowledge of the authors, although the latter non-asymptotic tail estimate is straightforward to establish via Theorem 3.2 as we have seen above, it seems to be new and recovers exactly the asymptotic regime emphasized above. Note that Paulauskas obtained a somewhat similar deviation inequality in Proposition 3 in [14], but with a constant which is however not sharp, in contrast to ours.
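Since $\omega(K)$ is a Poisson random variable with mean $\sigma(K)$ under $\mu_\sigma$, the estimate of Proposition 4.5 can be compared with the exact Poisson tail. In the sketch below `s` stands for $\sigma(K)$; the prefactor $\lceil \sigma(K)+r \rceil / r$ is the one read off from the derivation above, and the test values are arbitrary.

```python
import math

def poisson_tail(s, n, terms=400):
    """P(N >= n) for N ~ Poisson(s), summing the mass function computed in log scale."""
    return sum(math.exp(-s + k * math.log(s) - math.lgamma(k + 1))
               for k in range(n, n + terms))

def tail_bound(s, r):
    """Right-hand side of the estimate of Proposition 4.5, with s = sigma(K)."""
    n = math.ceil(s + r)
    exponent = n - s - n * math.log(n / s)
    return (n / r) * math.exp(exponent) / math.sqrt(2.0 * math.pi * n)

cases = [(0.5, 1.0), (2.0, 3.5), (10.0, 20.0)]
for s, r in cases:
    exact = poisson_tail(s, math.ceil(s + r))
    print(s, r, exact, tail_bound(s, r))
```

In each case the exact tail stays below the bound, and the two values get closer as $r$ grows, in line with the asymptotic regime recalled above.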

Now we aim at extending this tail estimate to a more general context. Given a fixed configuration $\eta \in \Gamma_\Lambda$, we provide in the sequel a deviation inequality from its mean of the total variation distance $\rho_1$ between $\eta$ and random configurations. Assume that $\sigma$ is a finite measure. Denoting by $\rho_\eta := \rho_1(\cdot,\eta)$ the function which clearly belongs to the set $\rho_1\text{-Lip}_1$, and using the same argument as above, we have

Pa {Pn > [pr,] + r) 

= Pa [fir, > [E^,, [Pr,] + r]) 
1 



< 



< 



< 



1 

[(t(A) + r] - r 



E,x„ [Pv'^{p^>lE^APv]+r]}] 



E 



l^il{p.>[E^„b.]+r]}l Mx) 

+E^<, [Pr,] Pa {Pn > [E^, [pr,] + r]) 
Pa {Pr, > W-fi^ [Pv] +r -1]) - Pa{Pr,> [E^„ [Pr,] + r])^ 



+ 



■ ^l^a [pr,] P<T (Pf > [Ea»<, [Pr,\ + ^]) : 



[Pr,]+r] 

since the intensity measure $\sigma$ is diffuse. Hence we obtain for any $r > 0$:

\a(A) + r] — r 

Pa {Pr, > [E^„ [Pn] + r]) < ^ Pa {Pr, > [Pv] + ^ - 1]) , 

and iterating the procedure entails the inequality 

Pa {Pv > [E^„ [Pn] +r])< [a{A)+r]\ ' 

Finally, using the estimates (4.3) yields the following result.

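Proposition 4.6. Given any fixed configuration $\eta \in \Gamma_\Lambda$ and provided the intensity measure $\sigma$ is finite, we have for any $r > 0$:

$$
\mu_\sigma\left( \rho_\eta \ge E_{\mu_\sigma}[\rho_\eta] + r \right) \le \frac{\sqrt{2\pi \lceil \sigma(\Lambda) \rceil}\, \lceil \sigma(\Lambda) \rceil^{\lceil \sigma(\Lambda) \rceil}\, e^{\frac{1}{12 \lceil \sigma(\Lambda) \rceil}}}{\sigma(\Lambda)^{\lceil \sigma(\Lambda) \rceil}} \cdot \frac{e^{\lceil \sigma(\Lambda)+r \rceil - \lceil \sigma(\Lambda) \rceil - \lceil \sigma(\Lambda)+r \rceil \log\left( \lceil \sigma(\Lambda)+r \rceil / \sigma(\Lambda) \right)}}{\sqrt{2\pi \lceil \sigma(\Lambda)+r \rceil}},
$$

where $\rho_\eta$ denotes the total variation distance $\rho_1(\cdot,\eta)$.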



Hence one deduces that the tail behavior of the total variation distance is comparable to the previous ones, up to constant multiplicative factors depending on the total mass $\sigma(\Lambda)$.

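Isoperimetric inequality. Here the distance of interest is the trivial distance $\rho_0$. In the sequel, we assume that the intensity measure $\sigma$ is finite, so that the domain $\operatorname{Dom} \nabla^\sharp$ contains the indicator functions $1_A$, $A \in \mathcal{B}(\Gamma_\Lambda)$. Given a Borel set $A \in \mathcal{B}(\Gamma_\Lambda)$, we define its surface measure as

$$
\mu_\sigma(\partial A) := E_{\mu_\sigma}\left[ \int_\Lambda |\nabla^\sharp_x 1_A|\, d\sigma(x) \right].
$$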

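Denote by $h_{\mu_\sigma}$ the classical isoperimetric constant that we aim at estimating:

$$
h_{\mu_\sigma} := \inf_{0 < \mu_\sigma(A) < 1} \frac{\mu_\sigma(\partial A)}{\mu_\sigma(A)\, (1 - \mu_\sigma(A))}.
$$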

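By the following co-area formula, available for any $F \in \operatorname{Dom} \nabla^\sharp$:

$$
E_{\mu_\sigma}\left[ \int_\Lambda |\nabla^\sharp_x F|\, d\sigma(x) \right] = E_{\mu_\sigma}\left[ \int_\Lambda \int_{-\infty}^{+\infty} |\nabla^\sharp_x 1_{\{F > t\}}|\, dt\, d\sigma(x) \right],
$$

which might be deduced from the identity $|a - b| = \int_{-\infty}^{+\infty} |1_{\{a > t\}} - 1_{\{b > t\}}|\, dt$, the constant $h_{\mu_\sigma}$ is also the best constant $h$ in the $L^1$-type functional inequality

$$
h\, E_{\mu_\sigma}\left[ \left| F - E_{\mu_\sigma}[F] \right| \right] \le 2\, E_{\mu_\sigma}\left[ \int_\Lambda |\nabla^\sharp_x F|\, d\sigma(x) \right], \quad F \in \operatorname{Dom} \nabla^\sharp. \qquad (4.4)
$$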

We have the following result, which is convenient for small total mass $\sigma(\Lambda)$.

Proposition 4.7. Assume that the measure $\sigma$ is finite. Then we have

$$
1 \le h_{\mu_\sigma} \le \frac{\sigma(\Lambda)}{1 - e^{-\sigma(\Lambda)}}. \qquad (4.5)
$$

In particular, we have the asymptotic for small total mass:

$$
\lim_{\sigma(\Lambda) \to 0} h_{\mu_\sigma} = 1.
$$

Note that Houdré and Privault first established the inequality $h_{\mu_\sigma} \ge 1$ by using the Poincaré inequality, cf. Proposition 6.4 in [12]. Hence we recover their result via another approach. On the other hand, our estimate in the right-hand side of (4.5) is sharp for small values of $\sigma(\Lambda)$, but is worse than their estimate for large $\sigma(\Lambda)$ since their upper bound is $8 + 8\sqrt{\sigma(\Lambda)}$.



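Proof. In order to show $h_{\mu_\sigma} \ge 1$, let us establish the inequality (4.4) with $h = 1$. By homogeneity, it is sufficient to prove the result for functionals $F \in \operatorname{Dom} \nabla^\sharp$ such that $E_{\mu_\sigma}[F] = 1$. Denote by $\nu$ the absolutely continuous probability measure with density $F$ with respect to the Poisson measure $\mu_\sigma$. Using duality,

$$
\begin{aligned}
T_{\rho_0}(\mu_\sigma,\nu) &= \sup_{G \in \rho_0\text{-Lip}_1} E_{\mu_\sigma}\left[ G\,(F - 1) \right] \\
&= \frac{1}{2} \sup_{\mu_\sigma\text{-}\operatorname{ess\,sup} |G| \le 1} E_{\mu_\sigma}\left[ G\,(F - 1) \right] \\
&= \frac{1}{2}\, E_{\mu_\sigma}\left[ |F - 1| \right].
\end{aligned}
$$

Hence using Theorem 3.2 with the trivial distance $\rho_0$, we get the inequality (4.4) with $h = 1$, thus obtaining the desired inequality $h_{\mu_\sigma} \ge 1$. On the other hand, to provide the upper bound in (4.5), note that we have by the very definition of $h_{\mu_\sigma}$:

$$
h_{\mu_\sigma} \le \frac{2\, \mu_\sigma(\partial\{\omega(\Lambda) = 0\})}{2\, \mu_\sigma(\omega(\Lambda) = 0)\, \left( 1 - \mu_\sigma(\omega(\Lambda) = 0) \right)} = \frac{\sigma(\Lambda)}{1 - e^{-\sigma(\Lambda)}}.
$$

The proof is achieved. □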
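The upper bound in (4.5) comes from the single test set $A = \{\omega(\Lambda) = 0\}$. The sketch below evaluates the corresponding ratio, assuming that the discrete gradient acts by adding a point (so that $\nabla^\sharp_x 1_A = -1_A$ for every $x$) and that the isoperimetric ratio is normalised by $\mu_\sigma(A)(1 - \mu_\sigma(A))$, as the computation in the proof suggests. The values of the total mass are arbitrary.

```python
import math

def empty_set_ratio(mass):
    """Ratio mu(dA) / (mu(A) (1 - mu(A))) for A = {omega(Lambda) = 0}, mass = sigma(Lambda)."""
    prob_empty = math.exp(-mass)      # mu_sigma(omega(Lambda) = 0)
    surface = mass * prob_empty       # E[ int |grad_x 1_A| dsigma(x) ] = sigma(Lambda) e^{-sigma(Lambda)}
    return surface / (prob_empty * (1.0 - prob_empty))

masses = (1.0, 0.1, 0.001)
for mass in masses:
    print(mass, empty_set_ratio(mass), mass / (1.0 - math.exp(-mass)))
```

The ratio coincides with $\sigma(\Lambda)/(1 - e^{-\sigma(\Lambda)})$, always exceeds $1$, and tends to $1$ as the total mass goes to $0$, which is the small-mass asymptotic of Proposition 4.7.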